Pantala flavescens (Fabricius, 1798) is one of the most common dragonfly species and is found
from tropical to temperate zones worldwide. In this study, RNA-seq of P. flavescens was
carried out using Illumina high-throughput sequencing technology. In total, 37,868 unigenes
and 47,188 transcripts were obtained. The average length of the assembled unigenes was 908.59 bp. We
identified 1442 cDNA simple sequence repeats (SSRs) among the 37,868 unigenes, comprising 864 (59.92%)
di-nucleotide repeats, 537 (37.24%) tri-nucleotide repeats, 32 (2.22%) complex repeats, and
9 (0.62%) tetra-nucleotide repeats. Sixty microsatellite markers were randomly selected
to test amplification. Of these 60 markers, 32 (53.33%) produced clear amplicons of the expected size, 10
(16.67%) amplified nonspecific products, and 18 (30%) failed to amplify. To assess their
applicability, the genetic diversity of the 32 SSR loci was tested in 32 individuals from Nanchang,
China. Of these loci, 14 were highly polymorphic, with observed (Ho) and expected (He)
heterozygosities ranging from 0.69 to 0.88 and from 0.96 to 0.98, respectively. Polymorphism
information content (PIC) ranged from 0.52 to 0.83. These highly polymorphic loci will be valuable
for the genetic analysis of distinct populations of P. flavescens.
Introduction


Pantala flavescens (Fabricius, 1798), commonly known as the wandering glider or the globe 
skimmer, may be the most widespread of any known dragonfly species (Russell, May, Soltesz, 
& Fitzpatrick, 1998). Previous studies indicated that P. flavescens was the most abundant species 
of Odonata, and possessed the capability to migrate several thousands of kilometers worldwide 
(Anderson, 2009; Artiss, 2004; Borisov, 2009; Feng, Wu, Ni, Cheng, & Guo, 2006). P. flavescens 
has a nearly cosmopolitan distribution and is thought to undertake extensive migrations
following the movement of the Inter-Tropical Convergence Zone (ITCZ; Hobson, Anderson,
Soto, & Wassenaar, 2012). The larval stage of P. flavescens is very short, about 34–43 days,
which allows this species to breed in the ephemeral freshwater pools and floods produced by
ITCZ rainfall (Hobson et al., 2012). Seasonal migration of P. flavescens occurs in many parts
of its range, with large swarms forming during long-distance migration (Srygley, 2003). In
China, P. flavescens was first reported as a migratory species by Feng et al. (2006) and shows three
movement strategies: wandering around northern China, north-bound (positive) movement and
south-bound (negative) movement, with the majority of individuals wandering around northern China
(Cao, Fu, Hu, & Wu, 2018). Catling (2005) used dragonflies to measure the efficiency of sewage
lagoons, and stated that several species worked well as indicators for that purpose. Dragonflies
have also been used as bioindicators of habitat quality and environmental health (Gamboa, Reyes, &
Arrivillaga, 2008; Martin & Maynou, 2016). Many dragonflies are considered important
natural enemies of insect pests such as Anopheles mosquitoes, flies, and gnats (Russell
et al., 1998).

Microsatellites, or simple sequence repeats (SSRs), are tandemly repeated motifs of 1–6 bp in
length that are widely distributed throughout eukaryotic genomes (Yang, Sun, Xue,
Zhu, & Hong, 2012). Because of their ubiquity and high level of polymorphism, microsatellite
markers have been used extensively to characterize genetic structure and diversity, to construct
phylogenetic trees and to identify unique sources of allelic diversity (Sun, Li, Yang, &
Hong, 2012). The traditional approach of constructing enriched genomic libraries to develop
SSR markers for P. flavescens is inefficient, labor intensive and time consuming
(Cao, Fu, & Wu, 2015). Next-generation sequencing technologies and the rapid development of
high-throughput platforms have raised the bar for sequencing non-model organisms
(Patnaik et al., 2016). These technologies offer a collated and comprehensive output
for the discovery of novel transcripts for functional studies and molecular marker development
(Bräutigam, Mullick, Schliesky, & Weber, 2011). In this study, we generated a large number
of high-quality transcriptome sequences on the Illumina HiSeq 4000 platform and used them to
discover a set of SSRs in P. flavescens. The genetic characteristics of the SSR markers
were analyzed and their application in genetic diversity analysis was validated. Finally, we
obtained 14 highly polymorphic microsatellite markers that will allow researchers to investigate
the genetic diversity and population genetic structure of P. flavescens in China and
worldwide.


Materials and methods 


Sample collection 


In total, 32 P. flavescens individuals were collected on 12 July 2016 from Nanchang, Jiangxi
province, China. The species has been assessed as Least Concern by the IUCN
(www.iucnredlist.org/species/59971/65818523#threats). DNA was extracted from the thoracic
muscle for the genetic diversity analysis. Total RNA was extracted from 10 heads for RNA-seq.


cDNA library construction and Illumina sequencing 


Pooled total RNA samples were sent to Shanghai Major Medical Laboratory Ltd (Shanghai,
China) for cDNA library construction and sequencing on the Illumina HiSeq 4000 platform
(Illumina Inc., San Diego, CA, USA). In brief, the cDNA library of the pooled
RNA was prepared using the Illumina TruSeq™ RNA sample preparation kit (Illumina Inc.)
according to the manufacturer’s instructions. Poly-T-attached magnetic beads (Illumina Inc.)
were used to isolate poly-A mRNA from total RNA. The mRNA was randomly fragmented into pieces
of about 200 bp by adding fragmentation buffer (Ambion Inc., Austin, TX, USA).
First-strand cDNA was synthesized from the fragmented mRNA using reverse transcriptase
(Invitrogen Inc., Carlsbad, CA, USA) and random primers. Second-strand cDNA was then
synthesized using DNA polymerase I (New England Biolabs,
Beijing, China) and RNase H (Invitrogen). Subsequently, the 3’ end of the double-stranded cDNA
was repaired using T4 DNA polymerase, T4 polynucleotide kinase and Klenow fragment (New
England BioLabs, Inc.). The end-repaired cDNA was ligated to Illumina paired-end (PE) adapters
and then enriched by 15 cycles of PCR amplification. Suitable fragments (about 200 bp) were
sequenced in paired-end mode on the Illumina HiSeq 4000 platform. The sequencing
data were collected automatically and output in FASTQ format.


Simple sequence repeat markers discovery and primer design 


Microsatellites in the unigene sequences of P. flavescens were located with the MIcroSAtellite
identification tool (MISA; http://pgrc.ipk-gatersleben.de/misa/). All types of simple sequence
repeats from mononucleotide to tetranucleotide were searched. The selection
criteria for SSR markers were as follows: the minimum number of repeat units was 5 for tri- and
tetranucleotide motifs and 6 for dinucleotide motifs. Mononucleotide repeats were excluded from
the SSR analysis. SSR loci used as genetic markers had to contain a perfect repeat motif
and two unique flanking sequences of about 200 bp on each side of the repeat.
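
The selection criteria above can be illustrated with a minimal Python sketch. It is not MISA and does not reproduce MISA's output format; the function names `find_ssrs` and `is_primitive` are hypothetical, and only the thresholds (at least 6 units for dinucleotide motifs, at least 5 for tri- and tetranucleotide motifs, mononucleotide runs ignored) come from the text.

```python
import re

# Minimum repeat counts from the text: 6 for di-, 5 for tri- and tetranucleotide motifs.
MIN_REPEATS = {2: 6, 3: 5, 4: 5}


def is_primitive(motif):
    """True if the motif is not itself a shorter unit repeated (e.g. 'AA', 'ATAT')."""
    n = len(motif)
    return not any(n % k == 0 and motif[:k] * (n // k) == motif for k in range(1, n))


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # Group 1 captures one motif; \1{n,} demands at least n further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if not is_primitive(motif):
                continue  # mononucleotide runs or shorter motifs in disguise
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    demo = "GGC" + "AT" * 7 + "CCGTA" + "ATG" * 5 + "TTG" + "AC" * 5
    for start, motif, count in find_ssrs(demo):
        print(start, motif, count)
```

In the demo sequence the (AC)5 run is not reported because it falls below the six-unit threshold for dinucleotide motifs. Flanking-sequence checks (about 200 bp of unique sequence on each side) are left out of this sketch.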

Forward and reverse primers were designed from the unique flanking sequences using
Batch Primer 3 (http://perlprimers.sourceforge.net). Primer lengths ranged from 15 to 25 bases,
with an optimal size of 20 nt. Annealing temperatures were between 45 and 60°C, with 55°C as the
optimum. PCR product lengths ranged from 100 to 300 bp.


SSR markers validation and genetic diversity analysis 


To verify the usability of the SSR markers, 60 primer pairs were randomly chosen for PCR
amplification. Genomic DNA was extracted from the thoracic muscles of an individual P. flavescens
using the hexadecyl trimethyl ammonium bromide (CTAB) method (Englen & Kelley, 2000).
PCR amplification was performed in a 25 µL volume containing 50 ng of genomic DNA, 0.4
µM each of the forward and reverse primers and 12.5 µL of 2× PCR Mix, using a Bio-Rad Thermal Cycler
(Bio-Rad Laboratories, Inc., Hercules, CA, USA). Cycling conditions were: initial denaturation
at 95°C for 2 min; 35 cycles of 94°C for 30 s, the locus-specific annealing temperature
(Tm; Table 1) for 30 s and 72°C for 30 s; and a final extension at 72°C for 5 min. The
amplification products were separated on a 3% agarose gel, with a 50 bp DNA ladder (Sangon Biotech
Co., Ltd, Shanghai, China) as the size standard. For the SSR markers
that gave stable and clear bands, each forward primer was synthesized with a universal
adapter (M13-21) on the 5’ end. PCR amplification was then performed in a 25 µL volume
containing three primers (0.3 µM fluorescent dye-labeled adapter, 0.1 µM adapter-tailed
forward primer and 0.4 µM reverse primer). The reaction conditions were identical to those
above. PCR amplicons were separated by capillary electrophoresis on an ABI 3730 DNA Analyzer
(Applied Biosystems, Foster City, CA, USA) along with the GeneScan-500 LIZ size
standard.
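
As a worked example of the reaction set-up described above, the sketch below converts the stated final primer concentrations into pipetting volumes using C1·V1 = C2·V2. The 25 µL reaction volume, 12.5 µL of 2× mix and the final primer concentrations are from the text; the 10 µM primer stocks and the 1 µL of template (that is, a 50 ng/µL DNA dilution) are assumptions made only for illustration, and the function names are hypothetical.

```python
def stock_volume_ul(final_um, reaction_ul, stock_um=10.0):
    """Solve C1*V1 = C2*V2 for the stock volume V1 in microlitres."""
    return final_um * reaction_ul / stock_um


def reaction_plan(primers_um, reaction_ul=25.0, mix_ul=12.5, dna_ul=1.0, stock_um=10.0):
    """Return component volumes (µL) for one reaction, topped up with water.

    primers_um maps a primer name to its final concentration in µM.
    stock_um (10 µM) and dna_ul (1 µL of 50 ng/µL DNA) are assumed values.
    """
    plan = {"2x PCR mix": mix_ul, "genomic DNA": dna_ul}
    for name, final_um in primers_um.items():
        plan[name] = stock_volume_ul(final_um, reaction_ul, stock_um)
    water = reaction_ul - sum(plan.values())
    if water < 0:
        raise ValueError("components exceed the reaction volume")
    plan["water"] = water
    return plan


if __name__ == "__main__":
    # Initial two-primer screening reaction.
    print(reaction_plan({"forward": 0.4, "reverse": 0.4}))
    # Three-primer reaction with the dye-labeled M13 adapter.
    print(reaction_plan({"M13 dye adapter": 0.3, "tailed forward": 0.1, "reverse": 0.4}))
```

Under these assumed stock concentrations, both reactions use 2 µL of primer in total and 9.5 µL of water; different stocks would change the volumes but not the final concentrations.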


Data analysis 


Allele data were scored using GENMARKER version 4.0 (Applied Biosystems, Foster City, CA, 
USA). MICRO-CHECKER ver. 2.2.3 (Van Oosterhout, Hutchinson, Derek, & Shipley, 2004) 
was used to assess null alleles and scoring errors. The program PopGene 32 (Yeh, Yang, &
Boyle, 1999) was used to estimate expected heterozygosity (He), observed heterozygosity
(Ho), the number of alleles (Na) and departure from Hardy–Weinberg equilibrium (HWE).
Polymorphism information content (PIC) values for each SSR marker were calculated using PIC_Calc 0.6.
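
For readers unfamiliar with these statistics, the following sketch computes Na, Ho, He and PIC for one locus from diploid genotypes. It uses the textbook definitions (He = 1 − Σp², and the Botstein et al. form of PIC = 1 − Σp² − ΣΣ 2p_i²p_j²); it is an illustration only, not the PopGene or PIC_Calc implementation, which may apply sample-size corrections. The function name and the example genotypes are hypothetical.

```python
from collections import Counter
from itertools import combinations


def locus_stats(genotypes):
    """Summarise one diploid locus.

    genotypes: list of (allele_a, allele_b) tuples, one per individual.
    Returns Na, Ho, uncorrected He, and PIC.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("no genotypes supplied")
    counts = Counter(allele for pair in genotypes for allele in pair)
    freqs = [c / (2 * n) for c in counts.values()]  # allele frequencies
    ho = sum(a != b for a, b in genotypes) / n      # observed heterozygotes
    he = 1.0 - sum(p * p for p in freqs)            # expected heterozygosity
    # PIC subtracts the share of uninformative heterozygote-by-heterozygote matings.
    pic = he - sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return {"Na": len(counts), "Ho": ho, "He": he, "PIC": pic}


if __name__ == "__main__":
    # Hypothetical allele sizes (bp) for four individuals.
    example = [(100, 100), (100, 104), (104, 104), (100, 104)]
    print(locus_stats(example))
```

With two equally frequent alleles, He is 0.5 and PIC is 0.375, which shows why PIC is always somewhat lower than He for the same locus.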
Results


Illumina HiSeq 4000 and P. flavescens transcriptome 


We obtained the head transcriptome of P. flavescens, comprising 47,039,040 raw reads
totalling 7,102,895,040 bases (Q20 percentage of 95.28%). After stringent filtering,
45,908,046 clean reads remained (Q20 percentage of 97.34%), with an accumulated length of
6,694,354,464 bases, representing 97.60% of the raw reads. These
reads were assembled into 37,868 unigenes and 47,188 individual transcripts. The average
length of the assembled unigenes was 908.59 bp (N50 = 1836 bp), which was shorter than
the average length of the assembled transcripts (1112.82 bp; N50 = 2223 bp). Among the
unigenes, 18,441 (48.70%) were 100–400 bp; 5016 (13.25%) were 401–600 bp; 2710 (7.16%)
were 601–800 bp; 1774 (4.68%) were 801–1000 bp; and 9927 (26.21%) were more than 1000 bp
in length (Figure 1).
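
The N50 values quoted above follow the usual assembly definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases. A minimal sketch of that calculation, using a hypothetical function name and toy lengths rather than the actual unigene set, is:

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    total = sum(lengths)
    if total <= 0:
        raise ValueError("need at least one sequence with positive length")
    running = 0
    # Walk from the longest sequence down until half of all bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    toy_lengths = [2, 3, 4, 5, 6]  # toy values, not real unigene lengths
    print(n50(toy_lengths))
```

Because long sequences dominate the cumulative sum, N50 is typically larger than the mean length, as seen for both the unigenes and the transcripts here.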


Frequency and types of SSR markers 


Among the 37,868 unigenes, 1442 SSRs were identified. Of the unigenes, 1235 contained
SSRs, and 32 (2.12%) sequences had more than one SSR. The types of repeat motif were not
evenly distributed: di-nucleotide repeat motifs were the most abundant at 864 (59.92%),
followed by 537 (37.24%) tri-nucleotide repeat motifs, 32 (2.22%) complex repeat motifs and
9 (0.62%) tetra-nucleotide repeat motifs (Figure 2).

The number of repeat units per SSR ranged from 5 to 12, with six repeats being the most abundant,
followed by five and seven repeats. Motifs with more than 10 repeats
were rare (12; 0.85%). Among the di-nucleotide repeats, AT/TA was the most abundant. The SSRs
were distributed across different regions, with 438 in coding regions, 537 in the
3’UTR, 117 in the 5’UTR and the remainder undetermined.


Figure 1. Frequency length distribution of Illumina read sequences.
Figure 2. Summary of the number of repeat units in P. flavescens SSR loci.


Development of polymorphic SSR markers


Sixty SSR markers were randomly selected to test amplification, and the genetic properties of
the amplified markers were assessed in 32 individuals from Nanchang, Jiangxi province. Of the
60 SSR markers, 32 (53.33%) produced clear bands of the expected size, 10 (16.67%) amplified
non-specific products, and 18 (30%) failed to amplify. Among the 32 SSR markers, 10 showed no
polymorphism, eight showed moderate polymorphism and 14 showed high polymorphism in the 32
individuals. These 14 highly polymorphic loci (PIC > 0.5) were submitted to GenBank and their
characteristics are outlined in Table 1. Among the 32 individuals of P. flavescens, 126 alleles in total
were detected at the 14 highly polymorphic loci; allele numbers varied between 5 and 14, with





Table 1. Characterization of 14 highly polymorphic microsatellite loci of P. flavescens.

Locus   Primer sequences (5’–3’)        Repeat motif   Size range (bp)   Tm (°C)   Accession no.

SSR2    F:*TCAAAAACTGTAAACCCGCC         (CT)6          179-228           32        MF002465
        R:GGGCAATGAATGCAGAAAAG
SSR5    F:*CAAAATTGGAGCAGCACTGA         (ATG)6         225-238           52        MF002466
        R:GCCGCCGTCATACAGATATT
SSR7    F:*GGAGCCAGCTGTTTTTATCG         (AG)6          162-270           52        MF002467
        R:ATCTCAGCCACCCCTACCTT
SSR8    F:*CACTTACATAATGGGTGAATCCTG     (AT)6          152-173           52        MF002468
        R:AGGCAACTACATTATGCCCG
SSR10   F:*TCAGGTTCAGCCTAGGTGCT         (AT)9          128-322           52        MF002469
        R:TGCAGCATTACCTTCAGCAT
SSR11   F:*ACCGATCAAAACGAAGCAAT         (AT)9          136-156           52        MF002470
        R:TGACGTCACCCTTTAGTGTCAT
SSR14   F:*GAGCGATAGGAAAAGAGGGG         (AG)8          205-213           52        MF002471
        R:GACTCTGTTGGGGTTCGTGT
SSR16   F:*TTTTTCCAAAGGTTTTCCTGA        (CT)8          242-256           50        MF002472
        R:TGCTACATGAAGGAGCAATGA
SSR17   F:*TCACACCTTTCCACACTTGC         (AT)8          236-245           50        MF002473
        R:CACTGTAAAAAGAAATGTTGACCA
SSR18   F:*GATTGGGAAATGAGCTGATGA        (TA)10         216-285           50        MF002474
        R:ACACACAACAATGCTGCCAT
SSR19   F:*GGAGACACCTTTTGTGAGTCG        (TA)7          116-164           55        MF002475
        R:TGTTGTTGCCTGTGTCACCT
SSR20   F:*CCTCGACTCTGGATCTCCACT        (ATC)7         191-203           55        MF002476
        R:CCAATGTGGATTCTTCGCTT
SSR21   F:*GCTCAGAGCACTTCAGCATTT        (TA)8          274-296           55        MF002477
        R:GAGTATAAATTGCACAGCCAAAAA
SSR22   F:*CGGGAGAGAGGGCTTAAGAG         (AC)7          105-193           50        MF002478
        R:CAGACTCTTGGCAGTGAGATG

Notes: Tm, annealing temperature; accession no.: GenBank accession number.


226 L.Z. Cao and Y.G. Yuan 


Table 2. Summary of genetic diversity of 14 SSRs in 32 individuals of P. flavescens.

Loci    Na   Ho     He     PIC    I      H-W

SSR2    14   0.72   0.97   0.83   2.15   0.59
SSR5    6    0.75   0.98   0.64   1.36   0.07
SSR7    11   0.81   0.98   0.62   1.57   0.04
SSR8    10   0.78   0.98   0.7    1.67   0.73
SSR10   12   0.81   0.98   0.66   1.67   1
SSR11   9    0.69   0.96   0.62   1.47   0.99
SSR14   5    0.72   0.97   0.68   1.43   0.00
SSR16   –    0.78   0.97   0.8    2.05   0.45
SSR17   –    0.72   0.97   0.55   1.18   0.00
SSR18   –    0.88   0.97   0.7    1.6    0.03
SSR19   –    –      –      0.62   1.3    0
SSR20   –    0.81   0.97   0.8    1.96   0.01
SSR21   –    0.81   0.96   0.7    1.61   0.9
SSR22   –    0.69   0.97   0.52   1.11   0.02


a mean of 9 per marker. Ho and He ranged from 0.69 to 0.88 and from 0.96 to 0.98,
respectively. Shannon’s diversity index (I) ranged from 1.11 to 2.15. Four SSRs (SSR7, SSR18, SSR20
and SSR22) deviated from Hardy–Weinberg equilibrium, and three SSRs (SSR14, SSR17 and
SSR19) deviated significantly from Hardy–Weinberg equilibrium. PIC varied from 0.52 to 0.83,
with an average of 0.57. The genetic diversity values for these SSR markers are given in Table 2.


Discussion 


With a nearly global distribution, P. flavescens may be the most widespread of any known 
dragonfly species (Hobson et al., 2012; Russell et al., 1998). Isotopic evidence suggested that 
the multigenerational journey may total over 18,000 km with single individuals traveling over 
6000 km during the transoceanic trek from northern India to east Africa (Hobson et al., 2012). 
The migratory behavior of P. flavescens presents a unique opportunity to ask how much
gene flow may be occurring on a global scale and how it influences both
the population structure and the genetic diversity of the species.

Previous studies using randomly amplified polymorphic DNA suggested low genetic diversity and
a high rate of gene flow among five geographically isolated populations of P. flavescens
within India (Christudhas & Mathai, 2014). Using PCR-amplified cytochrome
oxidase one (CO1) mitochondrial DNA data, Troast, Suhling, Jinguji, Sahlén, and Ware (2016)
suggested that high rates of gene flow were occurring among all included geographic regions
and that genes were being shared among individuals across the globe.

Although the mitochondrial CO1 gene has proven to be an effective marker for
studying divergence within and among species (Hebert, Ratnasingham, & DeWaard, 2003),
microsatellites might help to evaluate the findings uncovered thus far. Genetic analyses with
more loci over an even wider geographic range will be required to address this question.

In this study, we aimed to develop a set of microsatellite loci for genetic analysis of distinct
populations of P. flavescens across large geographical scales. Our results showed that
transcriptome sequencing was a very useful tool for SSR development compared with screening
enriched partial genomic libraries (Cao et al., 2015). Fourteen highly polymorphic SSR markers were
obtained from 60 SSRs selected randomly from the unigenes. This set of reliable SSR markers
might be used to study the genetic structure and genetic diversity of migratory Odonata across a
wider geographic region. In addition, the potential ecological impacts of the migratory behavior
are worthy of further discussion.